On Identifying User Session Boundaries in Parallel Workload Logs

Authors

  • Netanel Zakay
  • Dror G. Feitelson
Abstract

The stream of jobs submitted to a parallel supercomputer is actually the interleaving of many streams from different users, each of which is composed of sessions. Identifying and characterizing the sessions is important in the context of workload modeling, especially if a user-based workload model is considered. Traditionally, sessions have been delimited by long think times, that is, by intervals of more than, say, 20 minutes from the termination of one job to the submittal of the next job. We show that such a definition is problematic in this context, because jobs may be extremely long. Including each job's execution in the session may yield unrealistically long sessions, since users most probably do not always stay connected and wait for long jobs to terminate. We therefore suggest that sessions be identified based on proven user activity, namely the submittal of new jobs, regardless of how long they run.
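The two session definitions contrasted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Job` record, the function names, and the sample log are hypothetical, queue wait time is ignored, and reusing the same 20-minute threshold for gaps between submittals is an assumption made only for the comparison.

```python
from dataclasses import dataclass

# "say, 20 minutes" in the abstract; expressed in seconds here.
THRESHOLD = 20 * 60


@dataclass
class Job:
    # Hypothetical minimal record for one user's job.
    submit: int   # submit time, seconds since start of log
    runtime: int  # execution time in seconds (wait time ignored)


def sessions_by_think_time(jobs, threshold=THRESHOLD):
    """Traditional rule: start a new session when the gap from the
    previous job's termination to the next submittal exceeds threshold."""
    sessions = []
    prev_end = None
    for job in sorted(jobs, key=lambda j: j.submit):
        if prev_end is None or job.submit - prev_end > threshold:
            sessions.append([job])
        else:
            sessions[-1].append(job)
        prev_end = job.submit + job.runtime
    return sessions


def sessions_by_submit_gap(jobs, threshold=THRESHOLD):
    """Activity-based rule: start a new session when the gap between
    consecutive submittals exceeds threshold, ignoring runtimes.
    Using the same threshold value is an assumption of this sketch."""
    sessions = []
    prev_submit = None
    for job in sorted(jobs, key=lambda j: j.submit):
        if prev_submit is None or job.submit - prev_submit > threshold:
            sessions.append([job])
        else:
            sessions[-1].append(job)
        prev_submit = job.submit
    return sessions


# Hypothetical log for one user: a 10-hour job, then two short jobs
# submitted shortly after it terminates.
jobs = [
    Job(submit=0, runtime=36_000),
    Job(submit=36_300, runtime=60),
    Job(submit=36_420, runtime=60),
]

print(len(sessions_by_think_time(jobs)))
print(len(sessions_by_submit_gap(jobs)))
```

On this sample, the think-time rule merges all three jobs into one session spanning more than ten hours, because the second job arrives only five minutes after the long job ends. The submittal-based rule splits the long job from the later burst of activity.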

Similar resources

Analysing Web Search Logs to Determine Session Boundaries for User-Oriented Learning

Incremental learning approaches based on user search activities provide a means of building adaptive information retrieval systems. To develop more effective user-oriented learning techniques for the Web, we need to be able to identify a meaningful session unit from which we can learn. Without this, we run a high risk of grouping together activities that are unrelated or perhaps not from the sa...

User Modeling of Parallel Workloads

The goal of workload modeling is to simulate the expected workload, accurately enough to enable making correct design and administrative decisions. Several statistical features of production parallel computer workloads, which are not embodied in current models, have been identified. Their practical importance is demonstrated by two new kinds of schedulers – a key component in determining the ov...

Web Anomaly Detection Through the Creation of Access Usage Profiles

Due to the increase in cyber-attacks, the need for techniques that detect attacks on web servers has drawn attention. Unfortunately, many available security solutions are inefficient at identifying web-based attacks. The main aim of this study is to detect abnormal web navigation based on web usage profiles. In this paper, comparing scrolling behavior of a normal user with an attacker, and simu...

Detecting session boundaries from Web user logs

Detecting session boundaries on the Web is important for several reasons. Firstly, it is important to establish a common context for various statistics relating to user sessions and frequency of user activities. More specifically, it is important to detect some boundaries in order to group related information together for other applications, such as learning techniques for adaptive search engin...

Identification of User Sessions with Hierarchical Agglomerative Clustering

We introduce a novel approach to identifying Web search user sessions based on the burstiness of users’ activity. Our method is user-centered rather than population-centered or system-centered and can be deployed in situations in which users choose to withhold personal content information. We adopt a hierarchical agglomerative clustering approach with a stopping criterion that is statistically ...

Publication date: 2012